翻訳と辞書 (Translation and Dictionary)

Voice activity detection (English Wikipedia)

Voice activity detection (VAD), also known as speech activity detection or speech detection, is a speech-processing technique that detects the presence or absence of human speech. Its main uses are in speech coding and speech recognition. It can facilitate speech processing, and it can also be used to deactivate some processes during non-speech sections of an audio session: in Voice over Internet Protocol (VoIP) applications, for example, it can avoid unnecessary coding and transmission of silence packets, saving computation and network bandwidth.
VAD is an important enabling technology for a variety of speech-based applications. Various VAD algorithms have therefore been developed, offering different features and different compromises between latency, sensitivity, accuracy and computational cost. Some VAD algorithms also provide further analysis, for example whether the speech is voiced, unvoiced or sustained. Voice activity detection is usually language independent.
VAD was first investigated for use in time-assignment speech interpolation (TASI) systems.
== Algorithm overview ==
The typical design of a VAD algorithm is as follows:
# There may first be a noise reduction stage, e.g. via ''spectral subtraction''.
# Then some features or quantities are calculated from a section of the input signal.
# A classification rule is applied to classify the section as speech or non-speech – often this classification rule finds when a value exceeds a threshold.
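The feature and classification steps above can be sketched in a few lines. This is only a minimal illustration, not a production detector: it assumes frame energy as the single feature, a fixed threshold, and no noise reduction stage. The names `frame_energy_db` and `simple_vad`, the 160-sample frame length and the -30 dB threshold are all illustrative choices, not taken from the article.

```python
import math

def frame_energy_db(frame):
    """Feature step: mean power of one frame in dB (floored to avoid log(0))."""
    power = sum(s * s for s in frame) / len(frame)
    return 10.0 * math.log10(max(power, 1e-12))

def simple_vad(samples, frame_len=160, threshold_db=-30.0):
    """Classification step: True (speech) when the frame energy exceeds the threshold."""
    decisions = []
    # Non-overlapping frames; a trailing partial frame is ignored.
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        decisions.append(frame_energy_db(frame) > threshold_db)
    return decisions

# Synthetic input (assumed for illustration): two quiet frames,
# two loud tone frames standing in for speech, then two quiet frames.
quiet = [0.001 * math.sin(0.3 * n) for n in range(320)]
loud = [0.5 * math.sin(0.3 * n) for n in range(320)]
decisions = simple_vad(quiet + loud + quiet)
print(decisions)  # [False, False, True, True, False, False]
```

A pure energy threshold like this is the "simple level detection" case; it works only when the speech is clearly louder than the background.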
There may be some feedback in this sequence, in which the VAD decision is used to improve the noise estimate in the noise reduction stage, or to adaptively vary the threshold(s). These feedback operations improve the VAD performance in non-stationary noise (i.e. when the noise varies a lot).
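One possible form of this feedback is sketched below, under stated assumptions: the threshold is a running noise-level estimate plus a fixed margin, and the estimate is updated only on frames classified as non-speech. The function `adaptive_vad`, the 6 dB margin, the smoothing factor `alpha` and the synthetic tone frames are hypothetical choices made for illustration.

```python
import math

def frame_db(frame):
    """Mean power of one frame in dB, floored to avoid log(0)."""
    power = sum(s * s for s in frame) / len(frame)
    return 10.0 * math.log10(max(power, 1e-12))

def adaptive_vad(frames, margin_db=6.0, alpha=0.9):
    """Threshold = running noise estimate + margin; the VAD decision gates the update."""
    noise_db = frame_db(frames[0])  # assumption: the first frame contains noise only
    decisions = []
    for frame in frames:
        level = frame_db(frame)
        is_speech = level > noise_db + margin_db
        if not is_speech:
            # Feedback: only non-speech frames refine the noise estimate.
            noise_db = alpha * noise_db + (1.0 - alpha) * level
        decisions.append(is_speech)
    return decisions

def tone_frame(level_db, n=160):
    """Synthetic frame (a tone) whose mean power is roughly level_db."""
    amp = math.sqrt(2.0 * 10.0 ** (level_db / 10.0))
    return [amp * math.sin(0.3 * i) for i in range(n)]

# Background level drifts upward by 0.2 dB per frame; frames 20-24 are "speech".
levels = [-50.0 + 0.2 * k for k in range(40)]
for k in range(20, 25):
    levels[k] = -20.0
frames = [tone_frame(lv) for lv in levels]

adaptive = adaptive_vad(frames)
fixed_threshold = frame_db(frames[0]) + 6.0
fixed = [frame_db(f) > fixed_threshold for f in frames]

print([k for k, d in enumerate(adaptive) if d])  # [20, 21, 22, 23, 24]
print(any(fixed[32:]))  # True: the fixed threshold fires on late noise frames
```

In this toy case the adaptive threshold follows the slow drift of the background, whereas a threshold frozen at the initial noise level starts flagging noise as speech once the background has risen by more than the margin.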
A representative set of recently published VAD methods formulates the decision rule on a frame-by-frame basis, using instantaneous measures of the divergence distance between speech and noise. The measures used in VAD methods include spectral slope, correlation coefficients, log likelihood ratio, cepstral, weighted cepstral, and modified distance measures.
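As an illustration of one such measure, the sketch below computes a frame-level log likelihood ratio. It makes strong simplifying assumptions that are not from the article: samples are modelled as independent zero-mean Gaussians, with variance `noise_var` under the noise-only hypothesis and `noise_var + speech_var` when speech is present, and both variances are taken as known. Published methods typically work per frequency bin and estimate these quantities from the signal.

```python
import math
import random

def mean_llr(frame, noise_var, speech_var):
    """Average per-sample log likelihood ratio of H1 (speech + noise) against H0 (noise only).

    For zero-mean Gaussians with variances var1 and var0:
    log p(x|H1) - log p(x|H0) = 0.5*log(var0/var1) + 0.5*x^2*(1/var0 - 1/var1)
    """
    var0 = noise_var
    var1 = noise_var + speech_var
    power = sum(x * x for x in frame) / len(frame)
    return 0.5 * math.log(var0 / var1) + 0.5 * power * (1.0 / var0 - 1.0 / var1)

# Synthetic frames (assumed for illustration): Gaussian noise at two levels.
rng = random.Random(0)
noise_var, speech_var = 1e-4, 1e-2
noise_frame = [rng.gauss(0.0, math.sqrt(noise_var)) for _ in range(160)]
speech_frame = [rng.gauss(0.0, math.sqrt(noise_var + speech_var)) for _ in range(160)]

# Decision rule: declare speech when the ratio is positive.
print(mean_llr(noise_frame, noise_var, speech_var) > 0.0)   # False
print(mean_llr(speech_frame, noise_var, speech_var) > 0.0)  # True
```

Comparing the ratio with zero is the simplest rule; moving that comparison point shifts the balance between missed speech and false alarms.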
Whichever VAD algorithm is chosen, a compromise must be made between speech being detected as noise and noise being detected as speech (that is, between false negatives and false positives). A VAD operating in a mobile phone must be able to detect speech in the presence of a very diverse range of acoustic background noise. In these difficult detection conditions it is often preferable for a VAD to fail safe, indicating that speech was detected when the decision is in doubt, to lower the chance of losing speech segments. The biggest difficulty in detecting speech in this environment is the very low signal-to-noise ratios (SNRs) that are encountered. It may be impossible to distinguish between speech and noise using simple level detection techniques when parts of the speech utterance are buried below the noise.
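The trade-off can be made concrete with a small, made-up example. The frame levels and ground-truth labels below are invented for illustration, and `error_counts` is a hypothetical helper; the point is only that lowering a level threshold trades missed speech for false alarms.

```python
def error_counts(levels_db, labels, threshold_db):
    """Return (missed_speech, false_alarms) for a fixed level threshold."""
    missed = sum(1 for lv, is_speech in zip(levels_db, labels)
                 if is_speech and lv <= threshold_db)
    false_alarms = sum(1 for lv, is_speech in zip(levels_db, labels)
                       if not is_speech and lv > threshold_db)
    return missed, false_alarms

# Invented frame levels (dB) and ground truth; one noise frame (-35 dB)
# is louder than the quietest speech frame (-36 dB), as happens at low SNR.
levels = [-42, -38, -35, -33, -36, -30, -25, -34, -39, -41]
labels = [False, False, False, True, True, True, True, True, False, False]

print(error_counts(levels, labels, threshold_db=-32))  # (3, 0): strict threshold loses speech
print(error_counts(levels, labels, threshold_db=-37))  # (0, 1): fail-safe threshold passes some noise
```

Because the two classes overlap in level, no threshold in this example removes both kinds of error; a fail-safe design accepts the false alarm in order to keep all of the speech.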

Excerpt source: Wikipedia, the free encyclopedia

Copyright(C) kotoba.ne.jp 1997-2016. All Rights Reserved.